General-Purpose MCMC Inference over Relational Structures
Abstract
Tasks such as record linkage and multi-target tracking, which involve reconstructing the set of objects that underlie some observed data, are particularly challenging for probabilistic inference. Recent work has achieved efficient and accurate inference on such problems using Markov chain Monte Carlo (MCMC) techniques with customized proposal distributions. Currently, implementing such a system requires coding MCMC state representations and acceptance probability calculations that are specific to a particular application. An alternative approach, which we pursue in this paper, is to use a general-purpose probabilistic modeling language (such as BLOG) and a generic Metropolis-Hastings MCMC algorithm that supports user-supplied proposal distributions. Our algorithm gains flexibility by using MCMC states that are only partial descriptions of possible worlds; we provide conditions under which MCMC over partial worlds yields correct answers to queries. We also show how to use a context-specific Bayes net to identify the factors in the acceptance probability that need to be computed for a given proposed move. Experimental results on a citation matching task show that our general-purpose MCMC engine compares favorably with an application-specific system.
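The generic Metropolis-Hastings scheme with a user-supplied proposal mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's engine and does not model BLOG or partial worlds. It is a toy sampler over five integer states, and the names `metropolis_hastings`, `biased_cycle_proposal`, and `log_target` are hypothetical. It assumes the proposal reports its own forward and backward log-probabilities, so the sampler can form the acceptance ratio without knowing anything about the application.

```python
import math
import random
from collections import Counter


def metropolis_hastings(log_target, propose, initial_state, num_steps, rng):
    """Generic Metropolis-Hastings loop (illustrative sketch).

    log_target(state) returns an unnormalized log-probability.
    propose(state, rng) returns (candidate, log_q_forward, log_q_backward),
    where log_q_forward = log q(candidate | state) and
    log_q_backward = log q(state | candidate).
    """
    state = initial_state
    log_p = log_target(state)
    samples = []
    for _ in range(num_steps):
        candidate, log_q_fwd, log_q_bwd = propose(state, rng)
        log_p_cand = log_target(candidate)
        # Log of the Metropolis-Hastings acceptance ratio.
        log_alpha = (log_p_cand - log_p) + (log_q_bwd - log_q_fwd)
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            state, log_p = candidate, log_p_cand
        samples.append(state)
    return samples


# Toy target over states 0..4 with unnormalized weights (an assumption
# made only for this example).
weights = [1.0, 2.0, 3.0, 2.0, 1.0]


def log_target(state):
    return math.log(weights[state])


def biased_cycle_proposal(state, rng):
    """Asymmetric user-supplied proposal: step +1 w.p. 0.7, -1 w.p. 0.3."""
    if rng.random() < 0.7:
        return (state + 1) % 5, math.log(0.7), math.log(0.3)
    return (state - 1) % 5, math.log(0.3), math.log(0.7)


rng = random.Random(0)
samples = metropolis_hastings(log_target, biased_cycle_proposal, 0, 50000, rng)
counts = Counter(samples)
freqs = [counts[s] / len(samples) for s in range(5)]
```

Because the proposal is deliberately asymmetric, the backward and forward terms do not cancel, which is the situation a sampler that accepts arbitrary user-supplied proposals has to handle. With enough steps, `freqs` should approach the normalized weights.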
Similar resources
Approximate inference for first-order probabilistic languages
A new, general approach is described for approximate inference in first-order probabilistic languages, using Markov chain Monte Carlo (MCMC) techniques in the space of concrete possible worlds underlying any given knowledge base. The simplicity of the approach and its lazy construction of possible worlds make it possible to consider quite expressive languages. In particular, we consider two ext...
A General Method for Reducing the Complexity of Relational Inference and its Application to MCMC
Many real-world problems are characterized by complex relational structure, which can be succinctly represented in first-order logic. However, many relational inference algorithms proceed by first fully instantiating the first-order theory and then working at the propositional level. The applicability of such approaches is severely limited by the exponential time and memory cost of propositional...
Scalable Probabilistic Databases with Factor Graphs and MCMC
Incorporating probabilities into the semantics of incomplete databases has posed many challenges, forcing systems to sacrifice modeling power, scalability, or treatment of relational algebra operators. We propose an alternative approach where the underlying relational database always represents a single world, and an external factor graph encodes a distribution over possible worlds; Markov chai...
Sound and Efficient Inference with Probabilistic and Deterministic Dependencies
Reasoning with both probabilistic and deterministic dependencies is important for many real-world problems, and in particular for the emerging field of statistical relational learning. However, probabilistic inference methods like MCMC or belief propagation tend to give poor results when deterministic or near-deterministic dependencies are present, and logical ones like satisfiability testing a...
Bayesian Optimization of Partition Layouts for Mondrian Processes
The Mondrian process (MP) produces hierarchical partitions on a product space as a kd-tree, which can serve as a flexible yet parsimonious partition prior for relational modeling. Because partitions are generated recursively and the dimensionality of the partition state space varies, inference for MP relational modeling is extremely difficult. The prevalent inference method ...
Journal: CoRR
Volume: abs/1206.6849
Publication date: 2006